Translation and Dictionary

Unicode equivalence : English Wikipedia
Unicode equivalence

Unicode equivalence is the specification by the Unicode character encoding standard that some sequences of code points represent essentially the same character. This feature was introduced in the standard to allow compatibility with preexisting standard character sets, which often included similar or identical characters.
Unicode provides two such notions, canonical equivalence and compatibility:
Code point sequences that are defined as canonically equivalent are assumed to have the same appearance and meaning when printed or displayed. For example, the code point U+006E (the Latin lowercase "n") followed by U+0303 (the combining tilde "◌̃") is defined by Unicode to be canonically equivalent to the single code point U+00F1 (the lowercase letter "ñ" of the Spanish alphabet). Therefore, those sequences should be displayed in the same manner, should be treated in the same way by applications such as alphabetizing names or searching, and may be substituted for each other. Similarly, each Hangul syllable block that is encoded as a single character may be equivalently encoded as a combination of a leading conjoining jamo, a vowel conjoining jamo, and, if appropriate, a trailing conjoining jamo.
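The "n" plus combining tilde example and the Hangul case can be checked with a short sketch. It assumes Python's standard `unicodedata` module; the helper name `canonically_equivalent` and the choice of the syllable U+D55C are illustrative and do not come from the standard.

```python
import unicodedata

decomposed = "n\u0303"   # U+006E followed by U+0303 (combining tilde)
precomposed = "\u00f1"   # U+00F1, the single code point for "ñ"


def canonically_equivalent(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings after canonical decomposition."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


# The raw code point sequences differ...
print(decomposed == precomposed)                          # False
# ...but they are canonically equivalent.
print(canonically_equivalent(decomposed, precomposed))    # True

# A precomposed Hangul syllable decomposes into conjoining jamo.
hangul = "\ud55c"  # U+D55C, chosen here only as an example syllable
jamo = unicodedata.normalize("NFD", hangul)
print([f"U+{ord(c):04X}" for c in jamo])  # ['U+1112', 'U+1161', 'U+11AB']
```

Comparing after normalization, rather than comparing raw strings, is what lets searching and sorting treat the two spellings of "ñ" alike.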
Sequences that are defined as compatible are assumed to have possibly distinct appearances, but the same meaning in some contexts. Thus, for example, the code point U+FB00 (the typographic ligature "ff") is defined to be compatible with, but not canonically equivalent to, the sequence U+0066 U+0066 (two Latin "f" letters). Compatible sequences may be treated the same way in some applications (such as sorting and indexing), but not in others; and may be substituted for each other in some situations, but not in others. Sequences that are canonically equivalent are also compatible, but the converse is not necessarily true.
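The difference between the two notions shows up with the ligature from the example above. This is a minimal sketch, again assuming Python's `unicodedata` module; the variable names are illustrative only.

```python
import unicodedata

ligature = "\ufb00"  # U+FB00, the "ff" ligature
plain = "ff"         # U+0066 U+0066

# Canonical decomposition leaves the ligature untouched:
# it has no canonical equivalent.
canonical = unicodedata.normalize("NFD", ligature)
print(canonical == ligature)   # True

# Compatibility decomposition replaces it with two plain letters.
compat = unicodedata.normalize("NFKD", ligature)
print(compat == plain)         # True

# The character database marks the mapping as a compatibility one.
print(unicodedata.decomposition(ligature))  # <compat> 0066 0066
```

Because the compatibility mapping discards the typographic distinction, it suits tasks such as search indexing better than tasks that must preserve the original text.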
The standard also defines a text normalization procedure, called Unicode normalization, that replaces equivalent sequences of characters so that any two texts that are equivalent will be reduced to the same sequence of code points, called the normalization form or normal form of the original text. For each of the two equivalence notions, Unicode defines two normal forms, one fully composed (where multiple code points are replaced by single points whenever possible), and one fully decomposed (where single points are split into multiple ones). The canonical forms are named NFC (composed) and NFD (decomposed), and the compatibility forms NFKC and NFKD. Each of these four normal forms can be used in text processing.
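The four normal forms can be compared side by side on one short string. The sketch below assumes Python's `unicodedata.normalize`, which accepts the form names "NFC", "NFD", "NFKC" and "NFKD"; the helper `normal_forms` and the sample text are hypothetical.

```python
import unicodedata

FORMS = ("NFC", "NFD", "NFKC", "NFKD")


def normal_forms(text: str) -> dict[str, str]:
    """Hypothetical helper: return the text in each of the four normal forms."""
    return {form: unicodedata.normalize(form, text) for form in FORMS}


# Sample text: the "ff" ligature, then "n" with a combining tilde.
sample = "\ufb00n\u0303"

for form, value in normal_forms(sample).items():
    code_points = " ".join(f"U+{ord(c):04X}" for c in value)
    print(f"{form:5} {code_points}")
# NFC   U+FB00 U+00F1
# NFD   U+FB00 U+006E U+0303
# NFKC  U+0066 U+0066 U+00F1
# NFKD  U+0066 U+0066 U+006E U+0303

# Canonically equivalent inputs reduce to the same sequence in every form.
print(normal_forms(sample) == normal_forms("\ufb00\u00f1"))  # True
```

Which form to use depends on the application; the only requirement for reliable comparison is that both texts are normalized to the same form.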
==Sources of equivalence==


Excerpt source: Wikipedia, the free encyclopedia




Copyright(C) kotoba.ne.jp 1997-2016. All Rights Reserved.